Goto

Collaborating Authors

 figure 2





Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

Neural Information Processing Systems

Project lead, main contributor, correspondence to alexandre.rame@isir.upmc.fr. Equal experimental contribution, order determined at random. Further information and resources related to this project can be found on this website.





1 Appendix 1 Bayes-by-backprop The Bayesian posterior neural network distribution P (w |D) is approximated

Neural Information Processing Systems

In Algorithm 1 we give the full clustering algorithm used for each of the T fixing iterations. In Figure 1 we show how the layers' In Figure 2 we show the impact of increasing the regularisation strength.